1. Introduction

Video surveillance systems have become increasingly important for national security. Object tracking and action recognition are two essential components of any such system. Once an object is tracked and its motion has been classified into a standard category by comparing it against a database of actions, the difficult part is to link these actions, or groups of actions, spatio-temporally to discover events that are unusual or demand attention. In many cases, such linking is done by human operators who must sit continually in front of the surveillance monitors and watch for unusual events. For many hours of video data this becomes a Herculean task, and it calls for an automated system that can track the objects, classify the motion, and reason about the top-level actions in these surveillance videos. Although many trackers (Zhou, 2006) and motion classifiers (Junejo, 2008) are available today, none of them can reason about top-level plans involving complex events like burglary or escapade. In this paper, we present a novel approach to reasoning about top-level plans that combines Linear Temporal Logic with abduction-based reasoning.

The rest of the paper is organized as follows. Related work is discussed in Section 1.1, and Section 2 gives a formal description of Linear Temporal Logic. Section 3 presents the basics of our approach to mapping surveillance video frames to LTL. Section 4 discusses abductive reasoning and its use in probabilistic computations for reasoning about complex events in surveillance videos. Section 5 describes the proposed Bayesian framework used for inference. Implementation details and results are presented in Section 6. Section 7 concludes the paper with a discussion of the model and future work.

1.1 Related work

A theory for reasoning about actions based on Dynamic Linear Time Temporal Logic (DLTL) is proposed in (Giordano et al., 2001). It reasons about actions and change in a temporal logic by modeling the temporal projection problem and the planning problem as satisfiability problems in DLTL. Closely related to our work is the approach of (Raghavan and Mooney, 2011) to abductive plan recognition using Bayesian Logic Programs (BLPs). Their work is based on BLPs, whereas ours is based on Linear Temporal Logic (LTL).

Logical reasoning was first used for activity recognition in (Kautz, 1987), which provided a formal theory of plan recognition and described it as a logical inference process of circumscription. All actions and plans are uniformly referred to as goals, and a recognizer's knowledge is represented by a set of first-order statements called an event hierarchy, which defines abstraction, decomposition and functional relationships between types of events. Our work, in contrast, uses LTL to portray the temporal relations between actions in the event space. A method for robbery detection was proposed in (Chuang, 2007); it focuses primarily on baggage detection and hence might raise false alarms even for an ordinary event such as a normal visit to a store or bank. That approach lacks the ability to chain the multiple activities inherent in a composite event like robbery.

A process recognition strategy based on Linear Temporal Logic is proposed in (Kreutzmann et al., 2011). Our work differs in that we combine abductive reasoning with LTL to reason about complex events. An approach for motion classification using Motion History Images was proposed in (Ahad et al., 2010), while (Shao et al., 2012) proposed a method based on motion and shape analysis. A probabilistic framework for plan recognition based on the Abstract Hidden Markov Model is proposed in (Bui, 2003). Our approach is distinct in that it combines LTL and abductive reasoning to detect and predict complex real-life events in surveillance videos, such as burglary or escapade, or to distinguish between temporary and permanent parking of a car.

2. Linear Temporal Logic

Linear Temporal Logic (LTL) is a modal temporal logic with modalities referring to time. It encodes formulae about the future of paths and is used to represent real-world entities in a formal language that supports model checking. It was first proposed in (Pnueli, 1977) as a tool for formal verification of computer programs. The advantage of using LTL to model surveillance videos is that each video frame can be related logically to the previous and next frames through relations expressed in the temporal domain. The LTL operators used in this paper are:

X -> holds at the next instant

G -> holds on the entire subsequent path

F -> eventually has to hold (somewhere on the subsequent path)

An object's spatial location is marked by the 2-tuple (x, y) representing the pixel coordinates of its centroid.
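The three operators can be illustrated on a recorded clip. The sketch below is only an illustration under stated assumptions: LTL is defined over infinite paths, whereas here a clip is treated as a finite list of frames, and the names `atom`, `X`, `G`, `F` and the frame dictionaries are hypothetical helpers, not part of the system described in this paper.

```python
from typing import Callable, Sequence

Frame = dict  # hypothetical: one frame maps a proposition name to True/False
Prop = Callable[[Sequence[Frame], int], bool]

def atom(name: str) -> Prop:
    """Proposition that holds at position i when the frame marks `name` true."""
    return lambda trace, i: bool(trace[i].get(name, False))

def X(p: Prop) -> Prop:
    """Next: p holds at the next instant (false at the last frame of a finite clip)."""
    return lambda trace, i: i + 1 < len(trace) and p(trace, i + 1)

def G(p: Prop) -> Prop:
    """Globally: p holds at every position from i to the end of the clip."""
    return lambda trace, i: all(p(trace, k) for k in range(i, len(trace)))

def F(p: Prop) -> Prop:
    """Eventually: p holds at some position from i to the end of the clip."""
    return lambda trace, i: any(p(trace, k) for k in range(i, len(trace)))

trace = [{"moving": True}, {"moving": True, "stopped": False}, {"stopped": True}]
print(X(atom("moving"))(trace, 0))   # True
print(G(atom("moving"))(trace, 0))   # False
print(F(atom("stopped"))(trace, 0))  # True
```

Because each operator returns another proposition, formulas such as `F(G(atom("stopped")))` can be composed directly.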

3. Mapping surveillance videos to LTL

The first step in our approach is to map the surveillance video frames to Linear Temporal Logic. This requires a mechanism for representing the entities and actions in the formal language of LTL.

3.1 Symbols used to represent the real-world entities

$O \to \{O_1, O_2, \ldots, O_n\}$ represents the various objects that are considered part of the foreground.

$O \in \{C\} \cup \{H\} \cup \{A\}$, where $C$ represents the set of cars, $H$ the set of humans and $A$ the set of animals.

$L \to \{L_1, L_2, \ldots, L_n\}$ represents the object locations.

$V \to \{V_1, V_2, \ldots, V_n\}$ represents the velocities of the corresponding objects, quantified with the help of optical flow (Lucas and Kanade, 1981).

3.2 Atomic Propositions

$\mathrm{isAt}(t_i, O_j, L_k)$ -> Object $O_j$ is at location $L_k$ at time instant $t_i$, where $t_i$ belongs to a finite domain.

$\mathrm{isClose}(i, j)$ -> Entities $i$ and $j$ are in close proximity to each other, as defined by a threshold $\tau$. Proximity is measured in the unit in which the entities are defined, which may be Euclidean distance, appearance labels, or just magnitude.

$\mathrm{isLinear}(V_i)$ -> Object $O_i$ has a velocity $V_i$ that is linear for a certain period of time, within a pre-defined threshold.

$\mathrm{Mag}(V_i)$ -> Magnitude of the velocity of object $O_i$.
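A minimal sketch of how these propositions might be evaluated on tracker output is given below. The data layout is an assumption made for illustration: a track is a dictionary from frame index to centroid (x, y), a velocity is a pair (vx, vy), isClose uses Euclidean distance, and isLinear is read as "the heading stays within an angular tolerance". The function names and the tolerance parameter are hypothetical.

```python
import math

def mag(v):
    """Mag(V): Euclidean magnitude of a 2-D velocity (vx, vy)."""
    return math.hypot(v[0], v[1])

def is_at(track, t, location):
    """isAt(t, O, L): the object's track places it at `location` in frame t."""
    return track.get(t) == location

def is_close(a, b, threshold):
    """isClose for two 2-D points: Euclidean distance does not exceed the threshold."""
    return math.dist(a, b) <= threshold

def is_linear(velocities, angle_tol_deg=10.0):
    """isLinear(V): every heading stays within a tolerance of the first heading."""
    headings = [math.degrees(math.atan2(vy, vx)) for vx, vy in velocities]
    ref = headings[0]
    # Wrap each difference into [-180, 180) before comparing with the tolerance.
    return all(abs((h - ref + 180.0) % 360.0 - 180.0) <= angle_tol_deg
               for h in headings)

track = {0: (10, 20), 1: (12, 20)}
print(is_at(track, 0, (10, 20)))                       # True
print(is_close((0, 0), (3, 4), threshold=5.0))         # True
print(mag((3, 4)))                                     # 5.0
print(is_linear([(1, 0), (2, 0.1), (1.5, -0.1)]))      # True
```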

3.3 Integrity Constraints

Each frame represents a time instant $t_i$. An object cannot be present simultaneously at two locations in the same frame. This can be represented mathematically as:

$$\mathrm{isAt}(t_i, O_j, L_k) \wedge \mathrm{isAt}(t_i, O_j, L_m) \Rightarrow L_k \approx L_m \tag{1}$$
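The constraint can be checked mechanically on a list of detections. The sketch below assumes detections arrive as (frame, object id, location) triples and treats two locations as the same only when their coordinates match exactly; both the data layout and the exact-match rule are assumptions, and a distance tolerance could be substituted.

```python
from collections import defaultdict

def violations(detections):
    """Return the (frame, object_id) pairs reported at more than one location."""
    seen = defaultdict(set)
    for frame, obj, loc in detections:
        seen[(frame, obj)].add(loc)
    return sorted(key for key, locs in seen.items() if len(locs) > 1)

detections = [
    (0, "O1", (5, 5)),
    (0, "O1", (40, 9)),   # same object, same frame, second location
    (0, "O2", (7, 7)),
    (1, "O1", (6, 5)),
]
print(violations(detections))  # [(0, 'O1')]
```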

3.4 Complex events represented as a combination of composite atomic propositions

3.4.1 Occlusion (Event $E_1$):

Occlusion occurs if at time $t_i$ object $O_j$ is at location $L_k$ and, at the next instant $t_{i+}$, the object is not visible at any location $L_j$ close to $L_k$:

$$E_1 \to \mathrm{isAt}(t_i, O_j, L_k) \wedge G\big([\forall j]: \mathrm{isClose}(L_j, L_k) \wedge \neg\,\mathrm{isAt}(t_{i+}, O_j, L_j) \wedge t_{i+} \Rightarrow X\,t_i\big) \tag{2}$$
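As a rough illustration of the occlusion test, the sketch below checks a single step: the object is seen in frame t and, in frame t+1, is either missing or farther than a radius from its last location. The track layout (frame index to centroid), the function name and the radius parameter are assumptions; the check simplifies the formula to one next-instant step and therefore also flags the end of a track.

```python
import math

def occluded_at(track, t, radius):
    """True when the object is seen at frame t but not near that spot at frame t+1."""
    here = track.get(t)
    if here is None:
        return False  # the object must be visible at t for the event to start
    nxt = track.get(t + 1)
    return nxt is None or math.dist(here, nxt) > radius

# Frame 2 is missing from the track, and the track ends after frame 3.
track = {0: (10, 10), 1: (12, 10), 3: (18, 11)}
print([t for t in sorted(track) if occluded_at(track, t, radius=5.0)])  # [1, 3]
```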

3.4.2 Human entering a vehicle (Event $E_2$):

A human entering a vehicle is detected at time $t_p$ if an object $O_i$ at location $L_r$ belongs to the set of humans, there exists another object $O_j$ close to it, at location $L_k$, that belongs to the set of cars, and at the next instant $t_{p+}$ the human is not visible near the previous location.

$$E_2 \to \mathrm{isAt}(t_p, O_i, L_r) \wedge \mathrm{isAt}(t_p, O_j, L_k) \wedge (O_i \in H) \wedge (O_j \in C) \wedge \mathrm{isClose}(L_r, L_k) \wedge [\forall m: \mathrm{isClose}(L_m, L_r) \wedge \neg\,\mathrm{isAt}(t_{p+}, O_i, L_m)] \wedge t_{p+} \Rightarrow X\,t_p \tag{3}$$
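The same single-step reading can be applied to a human entering a vehicle. In the sketch below the caller is assumed to have already classified one track as human and the other as a car, so the set-membership conditions are implicit; the track layout, the function name and the shared radius are hypothetical choices for illustration.

```python
import math

def enters_vehicle(human_track, car_track, t, radius):
    """E2 at frame t: the human is near a car at t and gone from that spot at t+1."""
    h, c = human_track.get(t), car_track.get(t)
    if h is None or c is None or math.dist(h, c) > radius:
        return False  # human and car must both be visible and close at t
    nxt = human_track.get(t + 1)
    return nxt is None or math.dist(h, nxt) > radius

human = {0: (0, 0), 1: (4, 0), 2: (8, 0)}          # walks toward the car, then vanishes
car = {0: (10, 0), 1: (10, 0), 2: (10, 0), 3: (10, 0)}
print([t for t in range(3) if enters_vehicle(human, car, t, radius=3.0)])  # [2]
```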

3.4.3 Burglary or escapade (Event $E_3$):

Burglary or escapade is a composite event detected when one or more of the aforementioned events occur over time together with other atomic events of interest, such as a human carrying an object or the velocity of a car or human exceeding a threshold.

$$E_3 \to O_i \in H \wedge (\mathrm{Mag}(V_i) > \text{Threshold } T_1) \wedge H_0\ \text{detected} \wedge E_2 \wedge X(O_j \in C) \wedge F(\mathrm{Mag}(V_j) > \text{Threshold } T_2) \tag{4}$$

where,

$T_1$ -> Threshold for human velocity

$T_2$ -> Threshold for car velocity

$H_0$ -> Human carrying object
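To make the composition concrete, the sketch below scans a clip for the first frame at which the conjunction holds. It assumes each frame has already been summarized into a dictionary with the hypothetical keys `human_speed`, `carrying`, `e2` and `car_speed` (with `None` when the object is not visible); the X operator is read as "a car is present in the next frame" and F as "the car speed exceeds the threshold in this or some later frame".

```python
def first_e3(frames, t1, t2):
    """Index of the first frame where the composite event holds, or None."""
    for i, f in enumerate(frames):
        fast_human = f["human_speed"] is not None and f["human_speed"] > t1
        if not (fast_human and f["carrying"] and f["e2"]):
            continue
        # X(O_j in C): a car is visible in the next frame.
        car_next = i + 1 < len(frames) and frames[i + 1]["car_speed"] is not None
        # F(Mag(V_j) > T2): the car exceeds the threshold now or later.
        car_fast = any(g["car_speed"] is not None and g["car_speed"] > t2
                       for g in frames[i:])
        if car_next and car_fast:
            return i
    return None

frames = [
    {"human_speed": 1.0, "carrying": False, "e2": False, "car_speed": 0.0},
    {"human_speed": 4.0, "carrying": True, "e2": True, "car_speed": 0.0},
    {"human_speed": None, "carrying": False, "e2": False, "car_speed": 3.0},
    {"human_speed": None, "carrying": False, "e2": False, "car_speed": 15.0},
]
print(first_e3(frames, t1=2.5, t2=10.0))  # 1
```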

4. Abductive Reasoning

Abduction is a logical reasoning framework first proposed in (Peirce, 1901). In abduction, an explanation $a$ for an observation $b$ is derived by presuming that $a$ may be true because then $b$ would eventually follow. Thus, to abduce $a$ from $b$ involves determining that the occurrence of $a$ is sufficient (or nearly sufficient) for the eventual occurrence of $b$, but not necessary for $b$ to occur.

Given a theory $T$ (in LTL) describing normal/abnormal behavior in an environment and a set of observations $O$, an abduction engine computes a set $E$ of LTL formulas that form possible explanations for $O$ and are consistent with $T$. A probability distribution on the set $E$ (also called a belief state) is used to determine the most likely explanation. Technically, $E$ is a minimal set of LTL formulas that together with $T$ entails $O$; i.e., $T \wedge E \models O$.

Here, we assume a Bayesian framework with prior probabilities: we first determine the prior probabilities of all actions $A_i$ that can eventually lead to a particular observation $O$ and choose the $A_i$ with maximum a priori probability.

The LTL-based framework in Section 3 provides a deterministic plan recognition technique that is not flexible enough to incorporate probability distributions over the various a priori events. In most real-world scenarios, however, the atomic propositions are associated with probabilities provided either by the sensors or by the tracking/atomic action recognition system. This enables us to combine logical abduction with Bayesian inference to determine the most probable top-level plan.
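The selection step can be sketched as follows. The example is a toy: the rule table, the prior values and the function name are invented for illustration, and "entails" is reduced to a lookup in a table rather than LTL entailment. It keeps only the explanations that can lead to the observation, normalizes their priors into a belief state, and returns the most probable one.

```python
def most_likely_explanation(observation, rules, priors):
    """rules: explanation -> set of observations it can lead to;
    priors: explanation -> prior probability (hypothetical values)."""
    candidates = [a for a, entailed in rules.items() if observation in entailed]
    if not candidates:
        return None, {}
    total = sum(priors[a] for a in candidates)
    belief = {a: priors[a] / total for a in candidates}  # normalised belief state
    return max(belief, key=belief.get), belief

rules = {
    "burglary": {"running_to_car", "carrying_bag"},
    "late_for_work": {"running_to_car"},
    "shopping": {"carrying_bag"},
}
priors = {"burglary": 0.01, "late_for_work": 0.04, "shopping": 0.20}

best, belief = most_likely_explanation("running_to_car", rules, priors)
print(best)  # late_for_work
```

With only one observation the benign explanation wins; chaining several observations, as in the later sections, is what shifts the belief toward burglary.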

4.1 Example cases where probabilistic reasoning might help

4.1.1 Burglary or escapade:

In the burglary or escapade example of Section 3.4.3, the deterministic case considers only the velocity of the human entering the car and the subsequent velocity of the car. However, a major determining factor is the location of the incident. By matching the label on the ROI (Region of Interest) around the scene against a database of standard locations, we try to determine whether the location is a bank, a jewelry shop or an antique shop, because these places have a higher probability of witnessing a burglary than other places.

4.1.2 Filling up tracks under occlusion:

Both humans and cars can be occluded during tracking. For instance, a human could be occluded by a tree or a building, and a moving car could be occluded by a tree or another car. We therefore construct a map of the respective objects based on their speeds and appearance. The objects whose speeds and appearance are closest when going into occlusion and when reappearing have the highest probability of being identified as the same object.

4.1.3 Filling up tracks on vehicles that might have remained stationary for arbitrary periods of time:

Suppose a car comes to a standstill at some point. We cannot keep tracking it forever. So, by matching the label on the ROI around the car against a database of standard locations, we try to determine whether the location is a traffic signal or a parking lot. There is a high probability that a car at a signal is waiting temporarily and that a car in a parking lot has stopped permanently.

4.2 Probabilistic reasoning to perform abduction

The use of conditional probabilities to perform probabilistic Horn abduction was proposed in (Poole, 1993). Probabilistic Horn abduction integrates probabilistic and logical reasoning into a coherent practical framework. We build on the same idea but take a different approach, performing probabilistic reasoning on the Linear Temporal Logic formulas defined in Section 3.

Case 1: Burglary or escapade

Let us denote a bank by the label $B$ and an antique shop or jewelry shop by $AS$. The probability that the event $E$ is a burglary or escapade is given by

$$P(E = \text{Burglary/Escapade}) = P\big(F(\mathrm{isAt}(t_i, L_i, B) \vee \mathrm{isAt}(t_i, L_i, AS))\big) \wedge P(E_3) \tag{5}$$

Here, $P(F(\mathrm{isAt}(t_i, L_i, B))) = \mathrm{dist}(L_i - B)$ and $P(F(\mathrm{isAt}(t_i, L_i, AS))) = \mathrm{dist}(L_i - AS)$. Also, $E_3$ denotes the deterministic event presented earlier in equation (4), and $F$ denotes the eventually operator in LTL.

Expanding $P(E_3)$ in the above equation leaves the unknowns $P(\mathrm{Mag}(V_i) > \text{Threshold } T_1)$ and $P(H_0\ \text{detected})$ yet to be defined. We define them as follows:

$$P(\mathrm{Mag}(V_i) > \text{Threshold } T_1) = \begin{cases} 1, & \text{when } \mathrm{Mag}(V_i) > \text{Threshold } T_1 \\ 0, & \text{otherwise.} \end{cases}$$

$P(H_0\ \text{detected})$ is obtained from the template matching algorithm, which yields both the appearance labels discussed before and the detection of a human carrying an object.
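A worked numeric sketch of equation (5) is given below under several assumptions that are not stated in the text: the conjunction between probabilities is read as a product (independence), the disjunction over the two location labels is combined as a noisy-OR, and the location probabilities are passed in directly rather than derived from a distance. All function names, parameter names and input values are hypothetical.

```python
def p_speed_exceeds(speed, threshold):
    """Indicator probability from the text: 1 when speed > threshold, else 0."""
    return 1.0 if speed > threshold else 0.0

def p_burglary(p_bank, p_antique, human_speed, t1, p_carrying, p_e2, p_car_fast):
    # Assumption: disjunction combined as noisy-OR, conjunction as a product.
    p_location = 1.0 - (1.0 - p_bank) * (1.0 - p_antique)
    p_e3 = p_speed_exceeds(human_speed, t1) * p_carrying * p_e2 * p_car_fast
    return p_location * p_e3

p = p_burglary(p_bank=0.6, p_antique=0.1, human_speed=4.0, t1=2.5,
               p_carrying=0.8, p_e2=0.9, p_car_fast=1.0)
print(round(p, 4))  # 0.4608
```

Because the speed term is a hard indicator, a human moving below the threshold drives the whole product to zero regardless of the location.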

Case 2: Filling up tracks under occlusion

Suppose that at the instant an object $O_i$ is last seen before being occluded, it has velocity $u_i$ and appearance label $a1_i$. Each object going into occlusion has a velocity $u \in U$ and appearance label $a1 \in A1$; each object coming out of occlusion has a velocity $v \in V$ and corresponding appearance label $a2 \in A2$. Let us also define the track join operator as $A$, so that $T_i \mathbin{A} T_j$ denotes joining tracks $T_i$ and $T_j$. Hence,

$$P(T_i \mathbin{A} T_j) = P\big(F(\mathrm{isClose}(u_i, v_j))\big) \wedge P\big(F(\mathrm{isClose}(a1_i, a2_j))\big) \quad [\forall i, j: u_i \in U,\ v_j \in V,\ a1_i \in A1,\ a2_j \in A2] \tag{6}$$

Here,

$$P\big(F(\mathrm{isClose}(u_i, v_j))\big) = \frac{\min(|u_i - v_j|)}{|u_i - v_j|}$$

and

$$P\big(F(\mathrm{isClose}(a1_i, a2_j))\big) = \frac{\min(|a1_i - a2_j|)}{|a1_i - a2_j|} \quad [\forall i, j: u_i \in U,\ v_j \in V,\ a1_i \in A1,\ a2_j \in A2]$$

The normalization ensures that each probability is at most 1 and that the closer the velocities of two objects, the higher the probability of joining their tracks. The operator $F$ denotes the eventually modality of LTL.
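The normalized closeness measure can be sketched for one lost track and several candidate tracks that reappear after the occlusion. The code makes assumptions beyond the text: the minimum is taken over the candidates for a single lost track, appearance labels are scalars, the conjunction is read as a product, and a small `eps` is added to avoid division by zero when a candidate matches exactly. The names are hypothetical.

```python
def closeness(values, target, eps=1e-6):
    """min_j |target - v_j| / |target - v_j| per candidate; eps guards division by zero."""
    diffs = [abs(target - v) for v in values]
    smallest = min(diffs)
    return [(smallest + eps) / (d + eps) for d in diffs]

def join_probabilities(u, a1, candidates):
    """candidates: list of (velocity, appearance) pairs seen after the occlusion."""
    p_vel = closeness([v for v, _ in candidates], u)
    p_app = closeness([a for _, a in candidates], a1)
    return [pv * pa for pv, pa in zip(p_vel, p_app)]  # conjunction read as a product

# Lost track: velocity 5.0, appearance label 0.30 (both illustrative values).
candidates = [(5.5, 0.35), (9.0, 0.80), (2.0, 0.32)]
probs = join_probabilities(5.0, 0.30, candidates)
print(probs.index(max(probs)))  # 0
```

The first candidate wins because it is closest in velocity and reasonably close in appearance, while the third is closest in appearance but far off in velocity.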

Case 3: Filling up tracks on vehicles that might have remained stationary for arbitrary periods of time

As illustrated earlier, a location is marked as $L_i$. Let us denote a parking lot by the label $PL$ and a traffic signal or level crossing by $S$, and let $\delta t$ denote the wait time for keeping the tracks active on a stationary vehicle. For a parking lot,

$$P(\delta t = t_{perm}) = P\big(F(\mathrm{isClose}(V_i, 0) \wedge \mathrm{isAt}(t_i, L_i, PL))\big) \tag{7}$$

and for a traffic signal,

$$P(\delta t = t_{temp}) = P\big(F(\mathrm{isClose}(V_i, 0) \wedge \mathrm{isAt}(t_i, L_i, S))\big) \tag{8}$$

where $P(F(\mathrm{isClose}(V_i, 0))) = 1/|V_i|$ and $P(F(\mathrm{isAt}(t_i, L_i, PL))) = 1/\mathrm{dist}(L_i - PL)$, provided $|V_i|$ and $\mathrm{dist}(L_i - PL)$ do not fall below 1. Here $\mathrm{dist}(L_i - PL)$ denotes the distance between $L_i$ and $PL$, so the probability increases as $L_i$ approaches $PL$. Also, $t_{temp}$ and $t_{perm}$ are user-defined constants representing the temporary and permanent waiting times for a vehicle, and $F$ denotes the eventually modality of LTL.
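The inverse-distance and inverse-speed measures can be illustrated with a small sketch. Following the captions of Fig 7 and Fig 8, a parking lot is mapped to the permanent wait time and a traffic signal to the temporary one. The clamping at 1 mirrors the proviso in the text; treating the conjunction as a product, picking the larger of the two probabilities, and all names and coordinates are assumptions made for illustration.

```python
import math

def inverse_clamped(x):
    """1/x with x clamped to at least 1, so the result stays in (0, 1]."""
    return 1.0 / max(1.0, x)

def wait_time(speed, location, parking_lot, signal, t_temp, t_perm):
    """Return (wait time, probability) for the more likely kind of stop."""
    p_stopped = inverse_clamped(abs(speed))
    p_parking = p_stopped * inverse_clamped(math.dist(location, parking_lot))
    p_signal = p_stopped * inverse_clamped(math.dist(location, signal))
    return (t_perm, p_parking) if p_parking >= p_signal else (t_temp, p_signal)

# A nearly stationary car two pixels from a parking-lot label (illustrative values).
result = wait_time(0.5, (100, 40), parking_lot=(102, 40), signal=(300, 200),
                   t_temp=60, t_perm=3600)
print(result)  # (3600, 0.5)
```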


5.  Chaining  the  events  by  mapping  them  into  a  Bayesian  Framework 

A  Bayesian  network  (also  known  as  a  belief  network  or  probabilistic  causal  network)  captures  believed  relations  (which  may 
be  uncertain,  stochastic,  or  imprecise)  between  a  set  of  variables,  which  are  relevant  to  some  problem.  They  might  be  relevant 
because  they  will  be  observable,  because  their  value  is  needed  to  take  some  action  or  report  some  result,  or  because  they  are 
intermediate  or  internal  variables  that  help  express  the  relationships  between  the  rest  of  the  variables. 

Each node in a Bayesian network represents a scalar variable, which may be discrete, continuous or propositional. Once the nodes are abstracted out, they are connected by directed links. Each node has an associated probability vector whose number of elements depends on the number of nodes that the current node depends on. If the current node depends on one node, the vector has four elements, representing the cases where the previous node and the current node are true-true, true-false, false-true and false-false respectively. Similarly, a node that depends on two previous nodes has a probability vector of length eight. An example Bayes net from our implementation is pictured in Fig 1.

For instance, the probability that the velocity of a human is greater than the threshold is given by the vector

$$\left[\ (1-\delta)\,\frac{T_{human} - V_{min\_human}}{V_{max\_human} - V_{min\_human}} \qquad \delta\,\frac{V_{max\_human} - T_{human}}{V_{max\_human} - V_{min\_human}}\ \right]$$

Here, $\delta$ is a pre-defined constant that determines the hardness of the assumption. A value of 0.2 for $\delta$ means that even if an event $E_1$ is false, another event $E_2$ that depends on it has a 20 percent probability of being true and hence an 80 percent probability of being false. Each element of the probability vector of $E_2$ is thus the conditional probability of $E_2$ with respect to $E_1$. The probability vectors are determined by an expert and later updated based on incoming data.
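The bookkeeping described above can be sketched numerically. The code assumes binary nodes, reads the human-velocity vector as the pair [(1 - delta)(T - Vmin)/(Vmax - Vmin), delta (Vmax - T)/(Vmax - Vmin)], and uses delta as the probability of the child event when its parent is false; this reading, the function names and the numbers are illustrative assumptions rather than the implementation used in our system.

```python
def cpt_length(num_parents):
    """Entries in the vector of a binary node with binary parents: 2 ** (parents + 1)."""
    return 2 ** (num_parents + 1)

def velocity_vector(threshold, v_min, v_max, delta):
    """Two-element vector for the 'human velocity exceeds threshold' node."""
    span = v_max - v_min
    return [(1 - delta) * (threshold - v_min) / span,
            delta * (v_max - threshold) / span]

def p_child(p_parent, p_child_given_parent, delta):
    """Marginal of E2 when P(E2 | not E1) is the softness constant delta."""
    return p_child_given_parent * p_parent + delta * (1 - p_parent)

print(cpt_length(1), cpt_length(2))              # 4 8
vec = velocity_vector(threshold=2.5, v_min=0.0, v_max=10.0, delta=0.2)
print([round(x, 2) for x in vec])                # [0.2, 0.15]
print(round(p_child(0.7, 0.9, delta=0.2), 2))    # 0.69
```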
Fig 1. The Inference Engine for Burglary detection


6.  Implementation 

The action recognition module proposed in this paper uses our tracking¹ and motion classification² modules. The plan recognition module accurately distinguishes car stops at an intersection from car stops at a parking lot. It can also pick up a car again after an occlusion by linking the tracks using appearance and velocity labels through our inference engine, and it can distinguish a normal visit to a store from a burglary/escapade.

Figs 2, 3 and 4 portray cases of burglary detection in videos obtained from surveillance cameras. Figs 5 and 6 demonstrate the effectiveness of our approach under occlusion for a human and a car respectively. Fig 7 shows a temporary car stop at an intersection, while Fig 8 shows a permanent stop at a parking lot. The videos used for the experiments were obtained from public sources such as the VIRAT dataset and YouTube.


1  https://xythos.lsu.edu/users/mstagg3/web/tracker. 

2  Provided  along  with  the  supplementary  materials 
Burglary

Fig 2. $V_{car}$ > Threshold and $V_{human}$ > Threshold, human carrying object detected, $L_i$ in front of store, and human entering and exiting building detected, so probable burglary detected.

Fig 3. Human velocity greater than threshold and human carrying object detected inside store, so probable burglary alert.

Fig 4. $V_{car}$ > Threshold and $V_{human}$ > Threshold, and human exiting building and escaping in a car detected, so probable burglary detected.

Occlusion

Fig 5. Tracking a human (on the sidewalk) through occlusion by matching appearance and velocity labels.

Fig 6. Velocities and appearance labels are roughly similar for the same car, which indicates a greater likelihood of track merging after occlusion.

Intersection

Fig 7. Cars at an intersection. Waiting time for tracker $\delta t = t_{temp}$.

Parking Lot

Fig 8. A car at a parking lot. Waiting time for tracker $\delta t = t_{perm}$.


7.  Conclusions 

Our approach to high-level action recognition using LTL-based abductive reasoning offers a novel way of identifying complex events like burglary or escapade in surveillance videos. Linear Temporal Logic accounts for the temporal modalities between successive frames, while abductive reasoning, through the integration of probabilistic and logical reasoning frameworks as proposed in (Poole, 1993), proves to be a useful tool for reasoning about complex real-life events that are otherwise hard to detect with existing automated implementations.

We are currently working on integrating the ideas proposed in this paper into an ensemble learning framework that can automatically detect the top-level plans associated with a wide range of suspicious activities in surveillance videos.